Patent abstract:
Mode-dependent scanning of coefficients of a block of video data. This description describes apparatus and methods for encoding coefficients associated with a block of video data. In one example, a method may comprise selecting a scan order for the coefficients based on an intracoding mode used to predict the video data block and a transform block size used in transforming the video data block, and generating a syntax element to communicate the selected scan order for the video data block.
Publication number: BR112013015895B1
Application number: R112013015895-6
Filing date: 2011-12-14
Publication date: 2021-07-27
Inventors: Yunfei Zheng; Muhammed Zeyd Coban; Joel Sole Rojals; Marta Karczewicz
Applicant: Velos Media International Limited
IPC primary class:
Patent description:

[001] This application claims the benefit of the following U.S. provisional applications:
[002] U.S. Provisional Application No. 61/426,372, filed December 22, 2010; U.S. Provisional Application No. 61/426,349, filed December 22, 2010; and U.S. Provisional Application No. 61/436,835, filed January 27, 2011, the entire contents of each of which are incorporated herein by reference. Field of Invention
[003] This description relates to block-based video encoding techniques used to compress video data and more particularly to scanning techniques used to serialize the video block data during the encoding process. Description of Prior Art
[004] Digital video capabilities can be incorporated into a wide variety of video devices, including digital televisions, digital direct broadcast systems, wireless communication devices such as wireless telephone handsets, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, digital cameras, digital recording devices, video game devices, video game consoles, personal multimedia devices, and the like. Such video devices may implement video compression techniques, such as those described in MPEG-2, MPEG-4, or ITU-T H.264/MPEG-4, Part 10, Advanced Video Coding (AVC), in order to compress video data. Video compression techniques perform spatial and/or temporal prediction to reduce or remove the redundancy inherent in video sequences. New video standards, such as the High Efficiency Video Coding (HEVC) standard being developed by the "Joint Collaborative Team on Video Coding" (JCT-VC), which is a collaboration between MPEG and ITU-T, continue to emerge and develop. The emerging HEVC standard is sometimes referred to as H.265.
[005] These and other video encoding standards and techniques use block-based video encoding. Block-based video encoding techniques divide the video data of a video frame (or part of it) into video blocks and then encode the video blocks using block-based predictive compression techniques. Video blocks may be further divided into video block partitions. The video blocks (or partitions thereof) may be referred to as coded units (CUs) and may be encoded using specific video encoding techniques in addition to general data compression techniques.
[006] With the emerging HEVC standard, largest coded units (LCUs) can be divided into smaller CUs according to a quadtree partitioning scheme. CUs can be predicted based on so-called prediction units (PUs), which can have partition sizes corresponding to the size of the CUs or smaller than the size of the CUs, so that multiple PUs can be used to predict a given CU. CUs can be intracoded based on prediction data within the same frame or slice in order to exploit spatial redundancy within a video frame. Alternatively, CUs can be intercoded based on prediction data from another frame or slice, in order to exploit temporal redundancy across frames of a video sequence. After predictive coding (intra or inter), transform coding can then be performed, such as discrete cosine transforms (DCT), integer transforms, or the like. With HEVC, transform coding can take place with respect to transform units (TUs), which can also have variable transform sizes in the HEVC standard. Quantization of the transform coefficients, scanning of the quantized transform coefficients, and entropy coding can also be performed. Syntax information is signaled with the encoded video data, for example, in a video slice header or a video block header, in order to tell the decoder how to decode the video data. Invention Summary
[007] This description describes techniques in which different scan orders are defined and used for different intraprediction modes based on the intraprediction mode type and the transform block size used in the transform of a given block of transform coefficients. In one example, the scan order may be selected from among zigzag, horizontal, and vertical scan orders, although other scan orders may also be supported. A suitable scan order, for example in terms of a rate-distortion metric for the encoded video data, can be decided by the encoder by searching among the zigzag, horizontal, and vertical scan orders and comparing the coding results. The best scan order can be selected and used for encoding and then signaled within the bitstream (e.g., as a block-level syntax element) to the decoder. The performance improvements that can result from using different scan orders can add complexity to the encoder due to the exhaustive search for the best scan order. However, to balance the added complexity, the number of possible scan orders can be limited, and a switchable signaling scheme can be used to index the possible scan orders for the purpose of signaling the selected scan order for a video block. Both a fixed example and a switchable example of the techniques are explained in more detail below.
[008] In one example, this description describes a method of encoding coefficients associated with a block of video data. The method comprises selecting a scan order for the coefficients based on an intracoding mode used to predict the video data block and a transform block size used in transforming the video data block, and generating a syntax element to communicate the selected scan order for the video data block.
[009] In another example, this description describes a method of decoding coefficients associated with a block of video data. The method comprises receiving a syntax element with the video data block, where the syntax element defines a scan order from a set of higher scan order candidates, defining the set of higher scan order candidates based on one or both of an intracoding mode used to predict the video data block and a transform block size used in transforming the video data block, and inverse scanning the video data block from a serialized representation of the video data block to a two-dimensional representation of the video data block based on the syntax element with respect to the defined set of higher scan order candidates.
[010] In another example, this description describes a video encoding device that encodes the coefficients associated with a block of video data, the video encoding device comprising a prediction unit that performs intraprediction encoding of the video data block based on an intracoding mode, a transform unit that determines a transform size and performs a transform on the video data block according to the transform size, and a scan unit that selects a scan order for the coefficients based on the intracoding mode used to predict the video data block and the transform block size used in the video data block transform, and that generates a syntax element to communicate the selected scan order for the video data block.
[011] In another example, this description describes a video decoding device that decodes the coefficients associated with a block of video data, the video decoding device comprising a unit that receives a syntax element with the video data block, where the syntax element defines a scan order from a set of higher scan order candidates, and a scan unit that defines the set of higher scan order candidates based on one or both of an intracoding mode used to predict the video data block and a transform block size used in the video data block transform. The scan unit inverse scans the video data block from a serialized representation of the video data block to a two-dimensional representation of the video data block based on the syntax element with respect to the defined set of higher scan order candidates.
[012] In another example, this description describes a device that encodes coefficients associated with a block of video data, the device comprising means for selecting a scan order for the coefficients based on an intracoding mode used to predict the video data block and a transform block size used in the video data block transform, and means for generating a syntax element to communicate the selected scan order for the video data block.
[013] In another example, this description describes a device that decodes the coefficients associated with a video data block, the device comprising means for receiving a syntax element with the video data block, where the syntax element defines a scan order from a set of higher scan order candidates, means for defining the set of higher scan order candidates based on one or both of an intracoding mode used to predict the video data block and a transform block size used in transforming the video data block, and means for inverse scanning the video data block from a serialized representation of the video data block to a two-dimensional representation of the video data block based on the syntax element with respect to the defined set of higher scan order candidates.
[014] The techniques described in this description can be implemented in hardware, software, firmware, or combinations thereof. If implemented in hardware, an apparatus may be realized as an integrated circuit, a processor, discrete logic, or any combination thereof. If implemented in software, the software can run on one or more processors, such as a microprocessor, application-specific integrated circuit (ASIC), field-programmable gate array (FPGA), or digital signal processor (DSP). The software that performs the techniques can be initially stored on a tangible computer-readable storage medium and loaded and executed on the processor.
[015] Accordingly, this description also contemplates a computer-readable storage medium comprising instructions that, when executed, cause a processor to encode the coefficients associated with a block of video data, where the instructions cause the processor to select a scan order for the coefficients based on an intracoding mode used to predict the video data block and a transform block size used in the video data block transform, and generate a syntax element to communicate the selected scan order for the video data block.
[016] Additionally, this description describes a computer-readable medium comprising instructions that, when executed, cause a processor to decode the coefficients associated with a block of video data, where the instructions cause the processor, after receiving a syntax element with the video data block, where the syntax element defines a scan order from a set of higher scan order candidates, to define the set of higher scan order candidates based on one or both of an intracoding mode used to predict the video data block and a transform block size used in the video data block transform, and to inverse scan the video data block from a serialized representation of the video data block to a two-dimensional representation of the video data block based on the syntax element relative to the defined set of higher scan order candidates.
[017] Details of one or more aspects of the description are presented in the attached drawings and in the description below. Other features, objectives and advantages of the techniques described in this description will be apparent from the description and drawings, and from the claims. Brief Description of the Drawings
Figure 1 is a block diagram illustrating a video encoding and decoding system that can implement one or more of the techniques in this description;
Figure 2 is a block diagram illustrating an illustrative video encoder consistent with one or more examples of this description;
Figure 3 is a block diagram illustrating an illustrative video decoder consistent with one or more examples of this description;
Figure 4 is a conceptual diagram of video blocks partitioned according to a quadtree partitioning scheme;
Figure 5 is a decision tree representing the partitioning decisions that result in the quadtree partitioning illustrated in Figure 4;
Figures 6A through 6C are conceptual diagrams illustrating illustrative scan orders, including a zigzag scan (Figure 6A), a horizontal scan (Figure 6B), and a vertical scan (Figure 6C);
Figure 7 is a conceptual diagram illustrating illustrative prediction modes consistent with the emerging HEVC standard;
Figure 8 is another conceptual diagram of video blocks partitioned according to a quadtree partitioning scheme;
Figures 9 to 13 are tables that can be used to implement techniques of this description;
Figures 14 through 16 are flowcharts illustrating techniques consistent with this description. Detailed Description of the Invention
[018] This description refers to scanning techniques performed on the coefficients of a video data block. The coefficients can comprise significant coefficients and zero-value coefficients, which are binary or flag values (that is, 0 or 1) that define whether the residual transform coefficients are significant (that is, non-zero) or not (that is, equal to zero). The significant coefficients can define a significance map that defines which of the residual transform coefficients of a video block are significant and which are not significant. Level values can be defined together with the significant coefficients in order to define the actual values of the non-zero residual transform coefficients. In this case, the significant coefficients and the zero-value coefficients define whether the residual transform coefficients are significant (that is, non-zero) or not (that is, equal to zero), and the level values define the actual values for the transform coefficients that are significant.
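As an illustrative sketch (not part of the patent text), the relationship between a coefficient block, its significance map, and its level values can be expressed as follows; the block values are hypothetical:

```python
def significance_map_and_levels(coeffs):
    """Split a 2-D block of quantized transform coefficients into a binary
    significance map (1 = significant/non-zero, 0 = zero) and the level
    values of the significant coefficients."""
    sig_map = [[1 if c != 0 else 0 for c in row] for row in coeffs]
    levels = [c for row in coeffs for c in row if c != 0]
    return sig_map, levels

# Hypothetical 4 x 4 block of quantized transform coefficients.
block = [
    [9, 3, 0, 0],
    [2, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
sig_map, levels = significance_map_and_levels(block)
```

The significance map alone tells a decoder where non-zero coefficients sit; the separately coded level values then supply their magnitudes.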
[019] The residual transform coefficients can comprise a block of transform coefficients in the frequency domain, which represents an energy distribution associated with a set of residual pixel values. The residual pixel values can comprise a block of values representing the residual differences between a block of video data being encoded, and a block of predictive video data used for predictive encoding, in the spatial domain. Residual pixel values can be quantized or not in different cases, and the techniques in this description can apply to either or both of these cases. The prediction data block may be intraprediction data from the same frame or slice as the video block being encoded, or it may be interprediction data defined from a different frame or slice with respect to the video block being encoded. A transform, such as a discrete cosine transform (DCT) or conceptually similar process, can be applied to the residual pixel values to produce transform coefficients in a frequency domain. A significance map of the significant coefficients can be created to represent whether the transform coefficients are significant (ie, non-zero) or not (ie, equal to zero).
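The residual-then-transform pipeline described above can be sketched as follows (a naive, unoptimized DCT-II for illustration only; the pixel values are hypothetical, and real codecs use fast integer transforms):

```python
import math

def dct2(block):
    """Naive 2-D DCT-II of a square block (for illustration, not speed)."""
    n = len(block)
    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

# Hypothetical 4 x 4 block being encoded and its intraprediction block.
original = [[5] * 4 for _ in range(4)]
prediction = [[1] * 4 for _ in range(4)]

# Residual pixel values: differences between the block being encoded
# and the predictive block, in the spatial domain.
residual = [[o - p for o, p in zip(ro, rp)] for ro, rp in zip(original, prediction)]

# Transform coefficients: the energy distribution in the frequency domain.
coeffs = dct2(residual)
```

For this flat residual, the energy concentrates entirely in the DC coefficient `coeffs[0][0]`, which illustrates why coefficients near the upper left of a transformed block tend to be the significant ones.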
[020] In video coding, scanning techniques are typically performed to serialize a block of coefficients from a two-dimensional representation to a one-dimensional representation. In many cases, after the transform, the residual coefficients that are located near the upper left corner of a video block are more likely to be significant, although the high-energy coefficients may be located elsewhere, due to the directional capability of the transform. So-called zigzag scanning can be an efficient scanning technique to serialize a block of significant coefficients (or a block of residual transform coefficients) from a two-dimensional representation to a one-dimensional representation in order to group the significant coefficients toward the front of the serialized one-dimensional representation. However, other scanning techniques (such as a horizontal scan, a vertical scan, combinations of zigzag and horizontal scans, combinations of zigzag and vertical scans, adaptive scans, or other more complex scan patterns) may be more effective in some cases. Some intracoding modes often result in significant coefficient distributions that are oriented toward the left vertical edge of the block or the upper edge of the block. In such cases, using a different (e.g., non-zigzag) scan order can improve the coding efficiency of the video encoding so as to improve video compression.
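As a sketch (assuming square blocks; this is not the patent's own definition), the three scan orders named above can be generated and applied like this:

```python
def horizontal_scan(n):
    """Row-by-row, left to right."""
    return [(r, c) for r in range(n) for c in range(n)]

def vertical_scan(n):
    """Column-by-column, top to bottom."""
    return [(r, c) for c in range(n) for r in range(n)]

def zigzag_scan(n):
    """Anti-diagonals starting at the upper left, alternating direction."""
    order = []
    for d in range(2 * n - 1):
        diag = [(r, d - r) for r in range(n) if 0 <= d - r < n]
        order.extend(diag if d % 2 == 1 else reversed(diag))
    return order

def serialize(block, order):
    """Turn a 2-D coefficient block into a 1-D list along a scan order."""
    return [block[r][c] for r, c in order]
```

When the significant coefficients cluster near the upper left, the zigzag order groups them at the front of the serialized list; when they cluster along the top rows or left columns, the horizontal or vertical order, respectively, does the same.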
[021] This description describes techniques in which different scan orders are defined and used for different intraprediction modes based on the intraprediction mode type and the transform block size used in the transform of a given block of transform coefficients. In one example, the selection can be among the zigzag, horizontal and vertical scan orders, although other scan orders may also be supported. The desired scan order, for example in terms of the rate-distortion performance of the encoding process, can be determined in the encoder by searching among the zigzag, horizontal and vertical scan orders, and comparing the results in terms of compression and video quality. The scan order selected by the encoder can be passed as an index in the bitstream (e.g., as a block-level syntax element) to the decoder. The performance improvements that can result from using different scan orders can add complexity to the encoder due to the exhaustive search for the best scan order. For this reason, additional techniques can limit the extent of the search to mitigate such complexity in the encoder.
[022] In one example, a mode-dependent fixed transform coefficient coding technique for intrablock coding is proposed. The mode-dependent fixed transform coefficient coding technique can associate the scanning order (also called the "scan order") with the intraprediction mode, meaning that the scan order for an intraprediction mode can be fixed for a given transform size. The encoder can avoid an exhaustive search across multiple scan orders in order to reduce complexity, as there are only a few possible scan orders, and at the same time, the techniques can exploit some of the benefits associated with an exhaustive search of all possible scan orders. Fixing the scan order for both encoding and decoding may be particularly desirable to support parallel implementation by both encoding and decoding devices.
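A minimal sketch of such a fixed association follows; the mode names and table entries are hypothetical illustrations, not the patent's actual mapping:

```python
ZIGZAG, HORIZONTAL, VERTICAL = 0, 1, 2

# Hypothetical fixed table: (intraprediction mode, transform size) -> scan order.
FIXED_SCAN = {
    ("vertical",   4): HORIZONTAL,  # vertical prediction tends to leave residual energy in the top rows
    ("horizontal", 4): VERTICAL,    # horizontal prediction tends to leave energy in the left columns
    ("dc",         4): ZIGZAG,
}

def fixed_scan_order(mode, size):
    """Encoder and decoder apply the same table, so no search and no
    per-block signaling is needed; fall back to zigzag if unlisted."""
    return FIXED_SCAN.get((mode, size), ZIGZAG)
```

Because the table is shared, the decoder derives the same scan order from the already-signaled intraprediction mode and transform size, which is what makes the fixed variant signaling-free.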
[023] The techniques of this description can be applied to intrablock coding. In H.264/AVC and the Test Model of the emerging HEVC standard, directional extrapolation methods can be used to predict an intrablock. Due to the directional prediction (that is, prediction based on intradata within the same video slice), the residual block (in the pixel domain) typically exhibits directional characteristics, which are then inherited by the transformed coefficient block (in the transform domain). For this reason, a mode-dependent scanning scheme for the transform coefficients (or simply for the significant coefficients of a significance map) can be very useful for improving coding efficiency.
[024] In another example, a mode-dependent switchable transform coefficient coding technique can be used. In this case, higher scan order candidates can be defined for each prediction mode, each transform size, or for combinations of prediction mode and transform size. The candidates in the set (that is, the higher scan order candidates) can differ based on both the prediction mode and the block size used. In this case, the best scan order for a block can be determined by the encoder from among the candidates in the set specified for the prediction mode, but the number of candidates can be less than the total number of possible scan orders. The technique can add some complexity to the encoder, but it can limit the complexity to a fixed number of candidate scan orders. The decoder can define the same set of higher scan order candidates as those defined in the encoder so that the decoder can properly interpret the syntax elements to determine the proper scan order to use.
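The switchable scheme might be sketched as below; the candidate sets and the cost function are assumptions for illustration, but the sketch shows how encoder and decoder stay synchronized by deriving the same set and exchanging only a small index:

```python
ZIGZAG, HORIZONTAL, VERTICAL = 0, 1, 2

def candidate_set(mode, size):
    """Hypothetical per-(mode, size) candidate sets; both sides derive them
    identically so a signaled index is unambiguous."""
    if size >= 16:
        return [ZIGZAG]  # a single candidate: nothing to search or signal
    if mode == "vertical":
        return [HORIZONTAL, ZIGZAG]
    if mode == "horizontal":
        return [VERTICAL, ZIGZAG]
    return [ZIGZAG, HORIZONTAL, VERTICAL]

def encoder_select(mode, size, cost):
    """Search only the (small) candidate set; return the index to signal."""
    cands = candidate_set(mode, size)
    return min(range(len(cands)), key=lambda i: cost(cands[i]))

def decoder_resolve(mode, size, idx):
    """Rebuild the same set and look up the signaled index."""
    return candidate_set(mode, size)[idx]
```

With a real rate-distortion cost in place of `cost`, the encoder's search is bounded by the size of the candidate set rather than by the number of all possible scan orders.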
[025] Figure 1 is a block diagram illustrating an illustrative video encoding and decoding system 10 that can implement techniques of this description. As illustrated in Figure 1, system 10 includes a source device 12 that transmits encoded video to a destination device 16 over a communication channel 15. The source device 12 and the destination device 16 may comprise any of a wide variety of devices. In some cases, source device 12 and destination device 16 may comprise wireless communication devices, such as cellular or satellite radiotelephones. The techniques of this description, however, which apply generally to scanning techniques in video encoding and video decoding, are not necessarily limited to wireless applications or configurations, and can be applied to non-wireless devices that include video encoding and/or video decoding capabilities. Source device 12 and destination device 16 are merely examples of encoding devices that can support the techniques described herein. Other video devices that may utilize the techniques of this description include digital televisions, digital direct broadcast systems, a wide range of wireless communication devices, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, digital cameras, digital recording devices, video game devices, video game consoles, personal multimedia devices, and the like.
[026] In the example of Figure 1, the source device 12 may include a video source 20, a video encoder 22, a modulator/demodulator (modem) 23, and a transmitter 24. The destination device 16 may include a receiver 26, a modem 27, a video decoder 28, and a display device 30. According to this description, the video encoder 22 of the source device 12 can be configured to perform the scanning techniques of this description during an encoding process to serialize the coefficients of a video data block from a two-dimensional block format to a one-dimensional format. Syntax elements can be generated at video encoder 22 in order to signal how the coefficients were scanned so that video decoder 28 can perform a reciprocal (that is, inverse) scan. In some examples, both video encoder 22 and video decoder 28 can be configured to determine a set of higher scan order candidates, for example, based on contextual information. In other examples, video encoder 22 can determine the scan order and simply encode the scan order into syntax information for use by video decoder 28.
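A sketch of the decoder-side inverse scan (assuming the decoder has already derived the scan order, here a simple raster order, from the signaled syntax element):

```python
def inverse_scan(serialized, order, n):
    """Place a serialized 1-D coefficient list back into an n x n block
    according to the same scan order the encoder used."""
    block = [[0] * n for _ in range(n)]
    for value, (r, c) in zip(serialized, order):
        block[r][c] = value
    return block

# Raster (horizontal) order as a stand-in for the signaled scan order.
n = 2
order = [(r, c) for r in range(n) for c in range(n)]
restored = inverse_scan([7, 0, 0, 5], order, n)
```

Because the same `order` drives both directions, serializing `restored` again along `order` reproduces the original 1-D list, which is the reciprocity the paragraph above describes.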
[027] The video encoder 22 of the source device 12 can encode the video data received from the video source 20 using the techniques of that description. Video source 20 may comprise a video capture device, such as a video camera, a video file containing previously captured video, or a video feed from a video content provider. As a further alternative, the video source 20 can generate computer graphics based data as the source video, or a combination of live video, archived video, and computer generated video. In some cases, if the video source 20 is a video camera, the source device 12 and the target device 16 may form so-called camera phones or video phones. In each case, the captured, pre-captured or computer generated video can be encoded by the video encoder 22.
[028] Once the video data is encoded by the video encoder 22, the encoded video information can then be modulated by the modem 23 according to a communication standard, for example, such as code division multiple access (CDMA), orthogonal frequency division multiplexing (OFDM), or any other communication standard or technique. The encoded and modulated data can then be transmitted to destination device 16 via transmitter 24. Modem 23 can include various mixers, filters, amplifiers or other components designed for signal modulation. Transmitter 24 can include circuits designed for data transmission, including amplifiers, filters and one or more antennas. Receiver 26 of destination device 16 receives information over channel 15, and modem 27 demodulates the information. Again, the video decoding process performed by video decoder 28 may include scanning techniques that are the inverse of those used by video encoder 22.
[029] The communication channel 15 may comprise any wired or wireless communication medium, such as a radio frequency (RF) spectrum or one or more physical transmission lines, or any combination of wired or wireless media. The communication channel 15 may form part of a packet-based network, such as a local area network, a wide area network, or a global network such as the Internet. Communication channel 15 generally represents any suitable communication medium, or collection of different communication media, for transmitting video data from source device 12 to destination device 16.
[030] Video encoder 22 and video decoder 28 can operate substantially in accordance with a video compression standard such as the emerging HEVC standard. However, the techniques in this description can also be applied in the context of a variety of other video encoding standards, including some older standards, or new emerging standards.
[031] Although not illustrated in Figure 1, in some cases, the video encoder 22 and the video decoder 28 can each be integrated with an audio encoder and decoder, and may include suitable MUX-DEMUX units, or other hardware and software, to handle encoding both audio and video into a common data stream or separate data streams. If applicable, MUX-DEMUX units can conform to ITU H.223 multiplexer protocol, or other protocols such as User Datagram Protocol (UDP).
[032] Video encoder 22 and video decoder 28 can each be implemented as one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, software, hardware, firmware, or any combination thereof. Each of video encoder 22 and video decoder 28 can be included in one or more encoders or decoders, any of which can be integrated as part of a combined encoder/decoder (CODEC) in a respective mobile device, subscriber device, broadcast device, server, or the like. In this description, the term coder refers to an encoder or a decoder, and the terms coder, encoder, and decoder refer to specific machines designed for the coding (encoding or decoding) of video data consistent with this description.
[033] In some cases, devices 12, 16 can operate in a substantially symmetrical way. For example, each of devices 12, 16 may include video encoding and decoding components. As such, system 10 can support one-way or two-way video transmission between video devices 12, 16, for example, for video sequencing, video playback, video broadcasting, or video telephony.
[034] During the encoding process, the video encoder 22 can perform various encoding techniques or operations. In general, video encoder 22 operates on video blocks within individual video frames (or other independently defined units of video such as slices) in order to encode the video blocks. Frames, slices, parts of frames, groups of images or other data structures can be defined as units of video information that include a plurality of video blocks. Video blocks can be fixed or variable sizes, and can differ in size according to a specified encoding standard. In some cases, each video frame may include a series of independently decodable slices, and each slice may include a series of video blocks, which can be arranged in even smaller blocks.
[035] Macroblocks are one type of video block defined by the ITU H.264 standard and other standards. Macroblocks typically refer to 16 x 16 blocks of data. The ITU-T H.264 standard supports intraprediction in various block sizes, such as 16 x 16, 8 x 8 or 4 x 4 for luma components, and 8 x 8 for chroma components, as well as interprediction in various block sizes, such as 16 x 16, 16 x 8, 8 x 16, 8 x 8, 8 x 4, 4 x 8 and 4 x 4 for luma components and corresponding scaled sizes for chroma components.
[036] The emerging HEVC standard defines new terms for video blocks. In particular, with HEVC, video blocks (or partitions thereof) can be referred to as "coded units" (CUs). With the HEVC standard, largest coded units (LCUs) can be divided into smaller and smaller CUs according to a quadtree partitioning scheme, and the different CUs that are defined in the scheme can be further partitioned into prediction units (PUs). LCUs, CUs, and PUs are all video blocks within the meaning of this description. Other types of video blocks can also be used, consistent with the HEVC standard or other video encoding standards. As such, the phrase "video block" refers to any size of video block. Furthermore, video blocks can sometimes refer to blocks of video data in the pixel domain, or blocks of data in a transform domain such as a discrete cosine transform (DCT) domain, a domain similar to DCT, a wavelet domain, or the like.
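A sketch of the quadtree partitioning idea (the split rule below is a hypothetical stand-in for a real encoder's rate-distortion decision):

```python
def quadtree_leaves(x, y, size, should_split, min_size=8):
    """Recursively partition a block at (x, y) of the given size into four
    equal sub-blocks wherever should_split says so; return the leaf CUs
    as (x, y, size) tuples."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):
            for dx in (0, half):
                leaves.extend(quadtree_leaves(x + dx, y + dy, half,
                                              should_split, min_size))
        return leaves
    return [(x, y, size)]

# Example: always split the 64 x 64 LCU once, then split only its
# top-left 32 x 32 quadrant once more.
cus = quadtree_leaves(0, 0, 64,
                      lambda x, y, s: s == 64 or (x == 0 and y == 0 and s == 32))
```

This yields seven leaf CUs: four 16 x 16 blocks covering the top-left quadrant plus three 32 x 32 blocks, mirroring how an LCU decomposes into differently sized CUs.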
[037] Additionally, video blocks (or video data blocks) can also refer to blocks of so-called significant coefficients. In fact, the scanning techniques of this description can be particularly useful in scanning such significant coefficients. Significant coefficients can comprise binary or flag values (that is, 0 or 1) that define whether the residual transform coefficients (again, which can be quantized or not quantized) are significant (that is, non-zero) or not (that is, equal to zero). Level values can also be used together with the significant coefficients to define the actual values of the residual transform coefficients. The residual transform coefficients may comprise a block of frequency-domain coefficients that represent an energy distribution associated with a set of residual pixel values. The residual pixel values, in turn, may comprise a block of values representing the residual differences between a block of video data being encoded and a block of predictive video data used for predictive encoding. The predictive data block may be intraprediction data from the same frame or slice as the video block being encoded, or it may be interprediction data defined from a different frame or slice with respect to the video block being encoded. The scanning techniques of this description can be used to select the scan order for blocks of significant coefficients that are intracoded, although similar techniques can also be used for intercoded blocks.
[038] The video encoder 22 can perform predictive encoding in which a block of video being encoded is compared with one or more prediction candidates in order to identify a prediction block. This predictive encoding process can be intra (in which case the prediction data is generated based on neighboring intradata within the same video frame or slice) or inter (in which case the prediction data is generated based on the data from video in previous or subsequent frames or slices). Again, the scanning techniques in this description can be used to select the scan order for blocks of significant coefficients that are intracoded, although similar techniques can also be used for intercoded blocks.
[039] After prediction block generation, the differences between the current video block being encoded and the prediction block are encoded as a residual block, and the prediction syntax (such as a motion vector in the case of intercoding, or a prediction mode in the case of intracoding) is used to identify the prediction block. The residual block (that is, a block of residual values) can be transformed to produce a block of transform coefficients, and the transform coefficients can be quantized. However, the techniques in this description can also be applied in the case of non-quantized transform coefficients. Transform techniques can comprise a DCT process or conceptually similar process, integer transforms, wavelet transforms, or other types of transforms. In a DCT process, as an example, the transform process converts a set of pixel values (eg residual values indicating differences between actual values and prediction values) into transform coefficients, which can represent the energy of the pixel values. in the frequency domain. The HEVC standard allows transforming according to Transform Units (TUs), which can be different for different CUs. TUs are typically sized according to the size of the CUs defined for a split LCU, although this may not always be the case. Quantization is typically applied to transform coefficients, and generally involves a process that limits the number of bits associated with any given transform coefficient.
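A hedged sketch of the quantization step named above (a plain uniform quantizer; real codecs derive the step size from a quantization parameter and apply rounding offsets, both omitted here):

```python
def quantize(coeffs, step):
    """Uniform quantization: map each transform coefficient to an integer
    level, limiting the bits needed to represent it (lossy)."""
    return [[int(c / step) for c in row] for row in coeffs]

def dequantize(levels, step):
    """Decoder-side reconstruction of approximate coefficient values."""
    return [[lvl * step for lvl in row] for row in levels]
```

Quantization is where information is discarded: small coefficients collapse to zero, which is precisely what makes the significance map sparse and the choice of scan order consequential.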
[040] After the transform and quantization, entropy coding can be performed on the quantized and transformed residual video blocks. Syntax elements, such as filter syntax information, division size, motion vectors, prediction modes, or other information, can also be included in the entropy encoded bitstream. In general, entropy coding comprises one or more processes that collectively compress a sequence of quantized transform coefficients and/or other syntax information. Scanning techniques can be performed on the quantized transform coefficients in order to define one or more serialized one-dimensional vectors of coefficients from the two-dimensional video blocks. The scanned coefficients are then entropy encoded along with any syntax information, for example, by context adaptive variable length coding (CAVLC), context adaptive binary arithmetic coding (CABAC), or another entropy coding process.
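The serialization step can be sketched as visiting the positions of a two-dimensional block in a chosen scan order (the block values and the raster order below are illustrative):

```python
def serialize(block, scan_positions):
    """Serialize a two-dimensional coefficient block into a one-dimensional
    vector by visiting positions in the given scan order (illustrative sketch)."""
    return [block[r][c] for (r, c) in scan_positions]

# A hypothetical 2x2 block of quantized coefficients and a raster (horizontal) scan.
block = [[9, 4], [2, 0]]
raster = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(serialize(block, raster))  # [9, 4, 2, 0]
```

Different scan orders are simply different position lists fed to the same serialization step.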
[041] In some cases consistent with this description, quantized transform coefficients are encoded by first encoding a significance map that identifies the significant coefficients within a transform block, and then encoding the levels or values of the non-zero transform coefficients. Again, the significant coefficients can comprise binary or indicator values (i.e., 0 or 1) that define whether the residual transform coefficients are significant (i.e., non-zero) or not (i.e., equal to zero). Additional information can also be coded to define the actual value or level associated with the significant coefficients in the significance map. The scanning techniques of this description can be applied to scanning the significance map.
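The split into a significance map plus levels can be sketched as follows (the coefficients are assumed to be already serialized in scan order; this is an illustrative decomposition, not the normative HEVC syntax):

```python
def significance_map_and_levels(coeffs):
    """Split a list of quantized transform coefficients (already in scan order)
    into a binary significance map and the levels of the non-zero
    coefficients (an illustrative sketch)."""
    sig_map = [1 if c != 0 else 0 for c in coeffs]
    levels = [c for c in coeffs if c != 0]
    return sig_map, levels

sig, lev = significance_map_and_levels([5, 0, -2, 0, 0, 1])
print(sig)  # [1, 0, 1, 0, 0, 1]
print(lev)  # [5, -2, 1]
```

The scan order affects only the significance map serialization; the levels follow the significant positions it defines.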
[042] As part of the encoding process, encoded video blocks can be decoded in order to generate the video data that is used for subsequent prediction-based encoding of subsequent video blocks. This is often referred to as the decoding loop of the encoding process, and generally reproduces the decoding that is performed by a decoder device. In the decoding loop of an encoder or a decoder, filtering techniques can be employed in order to improve video quality, for example, to smooth pixel boundaries and possibly remove artifacts from the decoded video. This filtering can be in-loop or post-loop. With in-loop filtering, the filtering of reconstructed video data takes place in the coding loop, which means that the filtered data is stored by an encoder or a decoder for subsequent use in the prediction of subsequent image data. In contrast, with post-loop filtering, the filtering of reconstructed video data takes place outside the coding loop, which means that unfiltered versions of the data are stored by an encoder or decoder for subsequent use in the prediction of subsequent image data. Loop filtering often follows a separate deblocking filtering process, which typically applies filtering to pixels that are at or near the boundaries of adjacent video blocks in order to remove blocking artifacts that manifest at video block boundaries.
[043] Figure 2 is a block diagram illustrating a video encoder 50 consistent with this description. Video encoder 50 may correspond to video encoder 22 of device 20, or to a video encoder of a different device. As illustrated in Figure 2, video encoder 50 includes a prediction unit 32, adders 48 and 51, and a memory 34. Video encoder 50 also includes a transform unit 38 and a quantization unit 40, in addition to an inverse quantization unit 42 and an inverse transform unit 44. Video encoder 50 also includes an entropy encoding unit 46, and a filter unit 47, which may include deblocking filters and post-loop and/or in-loop filters. The encoded video data and the syntax information defining the form of encoding can be communicated to entropy encoding unit 46. Entropy encoding unit 46 includes a scan unit 45, which can perform the scanning techniques of this description.
[044] In particular, scan unit 45 can perform a method of encoding coefficients associated with a block of video data. The video data block can comprise a CU within an LCU, where the LCU is divided into a set of CUs according to a quadtree division scheme, consistent with the emerging HEVC standard. In encoding the coefficients, scan unit 45 can select a scan order for the coefficients (e.g., significant coefficients) based on an intracoding mode used by prediction unit 32 to predict the video data block and a transform block size used by transform unit 38 to transform the video data block. Scan unit 45 can generate a syntax element to communicate the selected scan order for the video data block. As explained in more detail below, scan unit 45 can select the scan order from a first look-up table for luminance blocks, and can select the scan order from a second look-up table for chrominance blocks. The different look-up tables can be stored in memory 34, which can be accessible by scan unit 45, or can be stored in other memory accessible by scan unit 45.
[045] In some cases, instead of selecting the scan order from every possible scan order, scan unit 45 can define a set of top scan order candidates and select from that set of top scan order candidates. In this case, the decoder can be configured to define the same set of top scan order candidates as those defined in the encoder. Accordingly, the signaling between the encoder and decoder can follow a switched signaling scheme in which an index value defines which of the top scan order candidates is used. The decoder can receive the index value, define the same set of top scan order candidates, and apply the index value to determine which of the top scan order candidates to use.
[046] In one switched signaling example, scan unit 45 defines a set of top scan order candidates for each of a plurality of possible intracoding modes based on a set of possible scan order candidates, selects the scan order from the set of top scan order candidates for the intracoding mode used to predict the video data block, and generates the syntax element to identify the selected scan order from the set of top candidates associated with the intracoding mode used to predict the video data block.
[047] In another switched signaling example, scan unit 45 defines a set of top scan order candidates for each of a plurality of possible transform block sizes based on a set of possible scan order candidates, selects the scan order from the set of top scan order candidates for the transform block size used in transforming the video data block, and generates the syntax element to identify the selected scan order from the set of top candidates associated with the transform block size used in transforming the video data block.
[048] In another switched signaling example, scan unit 45 defines sets of top scan order candidates for combinations of possible intracoding modes and possible transform block sizes based on a set of possible scan order candidates defined for those combinations, selects the scan order from the set of top scan order candidates for the combination of intracoding mode and transform block size used to encode the video data block, and generates the syntax element to identify the selected scan order from the set of top candidates associated with that combination.
[049] Generally, during the encoding process, video encoder 50 receives a video block to be encoded, and prediction unit 32 performs the predictive encoding techniques. The video block may comprise a CU as outlined above, or may generally comprise any video data block consistent with a block-based video encoding technique or standard. For intercoding, prediction unit 32 compares the video block to be encoded with several blocks in one or more video reference frames or slices (e.g., one or more "lists" of reference data) in order to define a prediction block. Again, for intracoding, prediction unit 32 generates a prediction block based on neighboring data within the same coded unit. Prediction unit 32 outputs the prediction block, and adder 48 subtracts the prediction block from the video block being encoded in order to generate a residual block.
[050] Alternatively, for intercoding, the prediction unit 32 may comprise motion estimation and motion compensation units that identify a motion vector that points to a prediction block and generates the prediction block based on the motion vector. Typically, motion estimation is considered the motion vector generation process, which estimates motion. For example, the motion vector can indicate the offset of a prediction block within a prediction frame with respect to the current block being encoded within the current frame. Motion compensation is typically considered the process of collecting or generating the prediction block based on the motion vector determined by motion estimation. In some cases, motion compensation for intercoding may include interpolations for subpixel resolution, which allows the motion estimation process to estimate the motion of video blocks for such a subpixel resolution.
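Integer-pel motion compensation, as described above, can be sketched as fetching a displaced region from a reference frame (the frame contents and the motion vector below are illustrative; sub-pel interpolation is omitted):

```python
def motion_compensate(reference, block_pos, block_size, mv):
    """Fetch a prediction block from a reference frame at the position of the
    current block displaced by an integer-pel motion vector (dy, dx).
    Illustrative sketch; sub-pel interpolation is omitted."""
    y, x = block_pos
    dy, dx = mv
    return [row[x + dx : x + dx + block_size]
            for row in reference[y + dy : y + dy + block_size]]

# A hypothetical 8x8 "reference frame" whose sample at (r, c) is 8*r + c.
ref = [[8 * r + c for c in range(8)] for r in range(8)]
pred = motion_compensate(ref, block_pos=(2, 2), block_size=2, mv=(1, -1))
print(pred)  # [[25, 26], [33, 34]]
```

Motion estimation would search over candidate vectors and keep the one whose fetched block best matches the block being encoded.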
[051] After prediction unit 32 outputs the prediction block, and after adder 48 subtracts the prediction block from the video block being encoded in order to generate a residual block, transform unit 38 applies a transform to the residual block. The transform may comprise a discrete cosine transform (DCT) or a conceptually similar transform, as defined by the ITU H.264 standard or the HEVC standard. So-called "butterfly" structures can be defined to perform the transforms, or matrix-based multiplication can also be used. In some examples, consistent with the HEVC standard, the transform size can vary for different CUs, for example, depending on the level of division that occurs with respect to a given LCU. Transform units (TUs) can be defined in order to set the transform size applied by transform unit 38. Wavelet transforms, integer transforms, subband transforms or other types of transforms can also be used. In any case, transform unit 38 applies the transform to the residual block, producing a block of residual transform coefficients. The transform, in general, can convert the residual information from a pixel domain to a frequency domain.
[052] Quantization unit 40 then quantizes the residual transform coefficients to further reduce the bit rate. Quantization unit 40, for example, can limit the number of bits used to encode each of the coefficients. After quantization, entropy encoding unit 46 can scan and entropy encode the data. Again, this scan can be applied to so-called significant coefficients, which define whether each of the quantized and transformed coefficients is significant (that is, non-zero). In this way, scan unit 45 can receive a set of quantized and transformed coefficients, generate a significance map (in addition to the levels or values associated with any significant coefficients), and select and apply a scan order to the significance map. Entropy coding unit 46 can then apply entropy coding to the scanned coefficients and the other values and syntax elements in the encoded bitstream.
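As a rough illustration of the bit-limiting role of quantization, consider a uniform quantizer with a single step size (an illustrative sketch, not the normative HEVC quantizer):

```python
def quantize(coeffs, qstep):
    """Uniform quantization sketch: divide each transform coefficient by a
    step size and truncate toward zero, limiting the range (and hence the
    number of bits) of the quantized values."""
    return [int(c / qstep) for c in coeffs]

print(quantize([37, -12, 5, 0], 8))  # [4, -1, 0, 0]
```

Note how small coefficients quantize to zero, which is what makes the significance map sparse and the scan order choice consequential.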
[053] In particular, as noted above, scan unit 45 can perform a method of encoding coefficients associated with a block of video data, which can comprise the set of significant coefficients that form the significance map. In encoding the coefficients, scan unit 45 can select a scan order for the coefficients (e.g., significant coefficients) based on an intracoding mode used by prediction unit 32 to predict the video data block and a transform block size used by transform unit 38 to transform the video data block. Scan unit 45 can generate a syntax element to communicate the selected scan order for the video data block. Transform unit 38 and prediction unit 32 can supply contextual information (e.g., mode and block size) as syntax information to entropy encoding unit 46.
[054] Once the significance map is scanned by scan unit 45, entropy encoding unit 46 encodes the quantized transform coefficients (for example, by encoding syntax elements that define the significance map and the levels associated with any non-zero coefficients) according to an entropy coding methodology. Examples of entropy coding techniques that can be used by entropy coding unit 46 include context adaptive variable length coding (CAVLC) and context adaptive binary arithmetic coding (CABAC). The syntax elements included in the entropy encoded bitstream may include prediction syntax from prediction unit 32, such as motion vectors for intercoding or prediction modes for intracoding. The syntax elements included in the entropy encoded bitstream may also include filter information from filter unit 47, and the transform block size applied to the video block, for example, from transform unit 38.
[055] CAVLC is one type of entropy coding technique supported by the ITU H.264 standard and the emerging HEVC standard, which can be applied on a vectorized basis by entropy coding unit 46. CAVLC uses variable length coding (VLC) tables in a way that effectively compresses serialized "runs" of coefficients and/or syntax elements. CABAC is another type of entropy coding technique supported by the ITU H.264 standard or the HEVC standard, which can be applied on a vectorized basis by entropy coding unit 46. CABAC may involve several stages, including binarization, context model selection, and binary arithmetic coding. In this case, entropy encoding unit 46 encodes the coefficients and syntax elements according to CABAC. Many other types of entropy coding techniques also exist, and new entropy coding techniques will likely emerge in the future. This description is not limited to any specific entropy coding technique.
[056] Following entropy encoding by entropy encoding unit 46, the encoded video can be transmitted to another device or archived for later transmission or retrieval. Again, the encoded video can comprise the entropy encoded vectors and various syntax information (including the syntax information that informs the decoder of the scan order). Such information can be used by the decoder to properly configure the decoding process. Inverse quantization unit 42 and inverse transform unit 44 apply inverse quantization and inverse transform, respectively, to reconstruct the residual block in the pixel domain. Adder 51 adds the reconstructed residual block to the prediction block produced by prediction unit 32 to produce a reconstructed video block for storage in memory 34. Prior to such storage, however, filter unit 47 may apply filtering to the block to improve video quality. Filtering applied by filter unit 47 can reduce artifacts and smooth pixel boundaries. Furthermore, filtering can improve compression by generating predictive video blocks that comprise close matches to the video blocks being encoded.
[057] Figure 3 is a block diagram illustrating an example of a video decoder 60, which decodes a video sequence that is encoded as described here. The scanning techniques of this description can be performed by video decoder 60 in some examples. A video stream received at video decoder 60 may comprise an encoded set of picture frames, a set of frame slices, a commonly coded group of pictures (GOP), or a wide variety of video information units that include encoded video blocks (such as CUs or macroblocks) and syntax information defining how to decode such video blocks. In some cases, inverse scan unit 55 may simply apply the scan order that is signaled in the encoded bitstream. However, in the switched signaling examples, inverse scan unit 55 may need to determine the top scan order candidates in the same way that scan unit 45 of encoder 50 determined the top scan order candidates.
[058] Video decoder 60 includes an entropy decoding unit 52, which performs the reciprocal decoding function of the encoding performed by entropy encoding unit 46 of Figure 2. In particular, entropy decoding unit 52 can perform CAVLC or CABAC decoding, or any other type of entropy decoding used by video encoder 50. Before such entropy decoded data can be used, however, inverse scan unit 55 is invoked to reconvert the block of video data (e.g., the significance map) from a one-dimensional serialized format back to a two-dimensional block format. Level values associated with any significant coefficients in the significance map can also be decoded.
[059] In an example consistent with switched signaling, at least part of a method of decoding coefficients associated with a block of video data is performed by inverse scan unit 55. In particular, inverse scan unit 55 may receive a syntax element with the video data block, where the syntax element defines a scan order from a set of top scan order candidates. Inverse scan unit 55 can define the set of top scan order candidates based on one or both of an intracoding mode used to predict the video data block and a transform block size used in transforming the video data block, and inverse scan the video data block from a serialized representation of the video data block to a two-dimensional representation of the video data block based on the syntax element with respect to the defined set of top scan order candidates. Again, the coefficients may comprise significant coefficients and zero valued coefficients, and the video data block may comprise a significance map defining the significant coefficients and zero valued coefficients. Levels can be defined and communicated for those transform coefficients that are identified as being significant.
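The decoder-side inverse scan described above can be sketched as writing each serialized value back to the position the scan order visited (the vertical scan order for a 2x2 block shown here is illustrative):

```python
def inverse_scan(serialized, scan_positions, size):
    """Reconvert a serialized coefficient vector back into a two-dimensional
    block by writing each value to the position the scan order visited
    (an illustrative sketch of the decoder-side inverse scan)."""
    block = [[0] * size for _ in range(size)]
    for value, (r, c) in zip(serialized, scan_positions):
        block[r][c] = value
    return block

vertical = [(0, 0), (1, 0), (0, 1), (1, 1)]  # column-by-column scan of a 2x2 block
print(inverse_scan([7, 3, 0, 1], vertical, 2))  # [[7, 0], [3, 1]]
```

Encoder and decoder must agree on the position list, which is exactly why the scan order (or an index into an agreed candidate set) is signaled.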
[060] Video decoder 60 also includes a prediction unit 54, an inverse quantization unit 56, an inverse transform unit 58, a memory 62, and an adder 64. In particular, like video encoder 50, video decoder 60 includes a prediction unit 54 and a filter unit 57. Prediction unit 54 of video decoder 60 may include motion compensation elements and possibly one or more interpolation filters for subpixel interpolation in the motion compensation process. Filter unit 57 can filter the output of adder 64, and can receive entropy decoded filter information so as to define the filter coefficients applied for loop filtering.
[061] Scan order signaling can occur block by block, e.g., for macroblocks in H.264 or for CUs in HEVC. Consistent with the emerging HEVC standard, a video data slice can correspond to a set of LCUs that define some or all of a video frame. LCUs can refer to the largest coded units within the HEVC framework, which can be subdivided into smaller CUs according to quadtree partitioning. With quadtree partitioning, a square shaped LCU is divided into four square shaped coded units, and those coded units can also be subdivided into smaller coded units according to quadtree partitioning. Indicators can be associated with each CU to indicate whether further quadtree partitioning was used. An LCU can be subdivided into four CUs, and the four CUs can each be further divided into smaller CUs. The HEVC standard can support up to three quadtree partitioning levels from the original LCU, or possibly more. After partitioning LCUs into CUs, the different CUs can be further divided into PUs, which define the prediction sizes used to predict the CU. PUs can take on square shapes or other rectangular shapes.
[062] Figures 4 and 5 illustrate an example of how a video block within a slice (e.g., an LCU) can be divided into sub-blocks (e.g., smaller CUs). As illustrated in Figure 4, quadtree sub-blocks indicated by "ON" can be filtered by loop filters, while quadtree sub-blocks indicated by "OFF" may not be filtered. The decision of whether or not to filter a particular block or sub-block can be determined in the encoder by comparing the filtered result and the unfiltered result with respect to the original block being coded. Figure 5 is a decision tree representing the division decisions that result in the quadtree division illustrated in Figure 4. Figures 4 and 5 can be viewed individually or collectively as a filter map that can be generated in an encoder and communicated to a decoder at least once per slice of encoded video data. The scan order used for any transform unit defined for a given CU can be determined in the video encoder and communicated to the video decoder as part of the block-level syntax.
[063] With the emerging HEVC standard, transforms (such as discrete cosine transforms), integer transforms, Karhunen-Loeve transforms, or the like can be used to decompose the coefficients of a residual block of data into the frequency domain. Then, a "significance map" that illustrates the distribution of significant (i.e., non-zero) coefficients in the transform block can be coded. The levels (that is, the actual values) associated with the significant coefficients can also be coded. A scan unit (such as scan unit 45 of Figure 2) can be used to perform these encoding steps.
[064] To efficiently encode the significance map, a zigzag scan order can be used, based on the general observation that most nonzero coefficients are likely to be located in the low frequency area (upper left corner) of a transform block. However, to further improve the efficiency of coding the transform coefficients, additional scan orders (such as horizontal and vertical scans) can be used in cases where they improve coding efficiency. Horizontal scanning, for example, can follow a raster scan order. More complex and adaptive scan orders can also be defined and used, consistent with the techniques of this description.
[065] In this description, scan unit 45 implements different scan orders for different intraprediction modes based on the intraprediction mode type and the transform block size used in transforming a given block of transform coefficients. In one example, the selection may be among horizontal, vertical, and zigzag scan orders. The desired scan order, for example in a rate-distortion sense, can be decided by video encoder 50 (e.g., by scan unit 45) by searching among zigzag, horizontal and vertical scan orders (or other scan orders, if desired). The selected scan order can be transmitted within the bitstream (e.g., as block-level syntax) to the decoder.
[066] Consistent with the HEVC standard, transform unit 38 can support transforms of different sizes. For example, 128 x 128 transforms, 64 x 64 transforms, 32 x 32 transforms, 16 x 16 transforms, 8 x 8 transforms, 4 x 4 transforms, and 2 x 2 transforms can all be supported. This description, in one example, describes a mode dependent fixed transform coefficient coding technique, which may be specifically desirable for intrablocks. The mode dependent fixed transform coefficient coding technique can associate the scan order with the intraprediction mode, which means that the scan order for an intraprediction mode can be fixed for a given transform size. Scan unit 45 can avoid an exhaustive search among many scan orders, reducing complexity since there are only a few possible scan orders, and at the same time, scan unit 45 can exploit some of the benefits associated with multiple scan orders. Fixing the scan order for both encoding and decoding can be particularly desirable for simple, parallel implementation.
[067] In one example, the techniques of this description may focus on intrablock coding. In H.264/AVC and the emerging HEVC standard Test Model, directional extrapolation methods can be used to predict an intrablock. Due to the directional prediction (i.e., based on intra data within the same video slice), the residual block (in the pixel domain) typically exhibits directional characteristics, which are then inherited by the transformed coefficient block (in the transform domain). For this reason, a mode-dependent scanning scheme for the transform coefficients (or simply for the significant coefficients of a significance map) can be very useful for improving coding efficiency.
[068] Although the example below discusses three scan orders, additional scan orders can also be defined and used. The three illustrative scan orders can include: a zigzag scan (figure 6A), a horizontal scan (figure 6B), and a vertical scan (figure 6C). Variations, such as combinations of zigzag, vertical, or horizontal scans, can be used as well, in addition to more complex (and possibly adaptive) scan orders.
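The three scan orders above can be sketched as position generators; a minimal illustration (the diagonal-walking convention used for the zigzag here is one common choice, not necessarily the exact ordering of figure 6A):

```python
def horizontal_scan(n):
    """Row-by-row (raster) scan positions for an n x n block."""
    return [(r, c) for r in range(n) for c in range(n)]

def vertical_scan(n):
    """Column-by-column scan positions for an n x n block."""
    return [(r, c) for c in range(n) for r in range(n)]

def zigzag_scan(n):
    """Zigzag scan positions for an n x n block: walk the anti-diagonals,
    alternating direction, so low-frequency positions come first."""
    order = []
    for d in range(2 * n - 1):
        diag = [(r, d - r) for r in range(n) if 0 <= d - r < n]
        order.extend(diag if d % 2 else reversed(diag))
    return order

print(zigzag_scan(2))  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

Each generator visits every position exactly once, so the same serialization and inverse-scan machinery applies to all of them.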
[069] Illustrative intraprediction directions and mode indexes, consistent with the HEVC Test Model, are illustrated in Figure 7. The block can have a size between 2x2 and 128x128. A mapping function F(*), in equation (1) below, can be constructed to map an intraprediction mode (predMode) of a block with size (blkSize) to a scan order (scanIdx) among the zigzag, horizontal and vertical scans (or other scan orders). The techniques can define scanIdx values for the three scan orders as shown in Table 1 given in Figure 9. scanIdx = F(predMode, blkSize) Equation (1)
[070] The mapping function can comprise a mathematical model or a mapping table that can be stored in both the encoder 50 and the decoder 60. In one example, the mapping function can be implemented as a mapping table, as shown in Table 2 in Figure 10. In this case, the intramode index of 1-33 (i.e., predMode) and the transform size (i.e., blkSize) can be indexes into the table, which maps to a scan order value from Table 1 given in Figure 9. Table 1 given in Figure 9 can be a fixed table stored in the encoder and decoder, and the mode indexes of the table can be selected based on empirical testing of several different video sequences.
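The table-based form of the mapping F(predMode, blkSize) of equation (1) can be sketched as a dictionary lookup; the index assignments and table entries below are hypothetical placeholders for illustration, not the actual values of Tables 1 and 2:

```python
# Hypothetical scanIdx assignments (stand-ins for Table 1).
ZIGZAG, HORIZONTAL, VERTICAL = 0, 1, 2

# Hypothetical (predMode, blkSize) -> scanIdx entries (stand-ins for Table 2),
# assumed identical in encoder and decoder.
SCAN_TABLE = {
    (1, 4): HORIZONTAL,
    (1, 8): ZIGZAG,
    (2, 4): VERTICAL,
    (2, 8): ZIGZAG,
}

def select_scan_order(pred_mode, blk_size):
    """Look up the fixed scan order for an intra mode and transform size,
    falling back to zigzag when no entry is defined (an illustrative choice)."""
    return SCAN_TABLE.get((pred_mode, blk_size), ZIGZAG)

print(select_scan_order(1, 4))  # 1 (horizontal)
```

Because the table is fixed and stored on both sides, no search is needed at the encoder and no extra signaling is needed beyond the mode and transform size already in the bitstream.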
[071] Table 2 of Figure 10 can comprise a table applied with respect to "luminance blocks". For chrominance blocks, a similar lookup table approach can be used to determine the scan orders. The chrominance lookup table can have far fewer modes, as chrominance blocks may only support 4 intramodes, in an example consistent with the emerging HEVC standard. Table 3 of Figure 11 may comprise the chrominance lookup table corresponding to the luminance lookup table (Table 2) of Figure 10.
[072] In a further example, a mode dependent switchable transform coefficient coding technique can be used by scan unit 45 and inverse scan unit 55. In this case, top scan order candidates can be defined for each mode, for each transform size, or for different combinations of prediction mode and transform size. In this case, the best scan order for a block can be determined by scan unit 45 from among the top candidates in the set for a given prediction mode, but the number of candidates can be less than the total number of possible scan orders. This technique can add some complexity to encoder 50 (and decoder 60), but at the same time it can limit the complexity to a fixed number of candidate scan orders. Furthermore, the candidates may differ depending on the intraprediction mode, the transform block size, or both the intraprediction mode and the transform block size.
[073] Table 4 in Figure 12 illustrates an illustrative mapping of possible scan orders to index values. Scan units 45 and 55 can define a score function "Fs(*)" for each scan order according to equation (2), and this score function can calculate the probability with which a scan order may be selected as a candidate for a block with size blkSize and prediction mode predMode. s(scanIdx) = Fs(predMode, blkSize, scanIdx) Equation (2)
[074] Based on the score value s calculated by equation (2), scan units 45 and 55 can define the top three candidates for each prediction mode predMode associated with a given transform block size blkSize. Table 5 of Figure 13 provides an example of a candidate table generated by the above procedure for a given block size. According to this technique, scan unit 45 can signal one of three states (candidate0, candidate1 and candidate2) in this switchable scheme, but the candidates can map to different scan orders depending on the mode. Inverse scan unit 55 can apply the same candidate table so that the three states map properly to the correct candidates.
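A minimal sketch of this switchable scheme: a stand-in score function ranks the allowed scan orders, the top three become the candidate set, and the signaled syntax element is just an index into that set. The scores and order names below are illustrative assumptions, not values from equation (2) or Table 5:

```python
def top_candidates(pred_mode, blk_size, all_scan_orders, score_fn, k=3):
    """Rank scan orders by the score function and keep the k most probable
    as the candidate set for this (mode, size) combination."""
    ranked = sorted(all_scan_orders,
                    key=lambda s: score_fn(pred_mode, blk_size, s),
                    reverse=True)
    return ranked[:k]

def encode_scan_choice(selected, candidates):
    """The signaled syntax element is an index into the candidate set."""
    return candidates.index(selected)

def decode_scan_choice(index, candidates):
    """The decoder rebuilds the same candidate set and applies the index."""
    return candidates[index]

# Hypothetical per-order scores for one mode/size combination.
scores = {"zigzag": 0.2, "horizontal": 0.3, "vertical": 0.5, "adaptive": 0.1}
cands = top_candidates(1, 8, list(scores), lambda m, b, s: scores[s])
idx = encode_scan_choice("vertical", cands)
assert decode_scan_choice(idx, cands) == "vertical"
```

Because only an index into a small candidate set is signaled, the syntax element can be coded with fewer bits than a choice among all allowed scan orders.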
[075] According to the generated candidate table, a switchable scheme can be used to search for the best scan order, in terms of rate-distortion cost, among the candidates for a block having a particular prediction mode and a particular block size. The best scan order can be signaled in the bitstream as video block syntax. Since the candidate set is typically smaller than the set of all allowed scan orders, the amount of signaling information can be significantly reduced with respect to a scheme that signals a choice among all possible candidate scan orders.
[076] Figure 14 is a flowchart illustrating a technique consistent with this description. Figure 14 will be described from the perspective of video encoder 50 of Figure 2, although other devices can perform similar techniques. In particular, Figure 14 illustrates a method of coding coefficients (e.g., significant coefficients) associated with an intracoded block of video data. Scan unit 45 determines an intracoding mode (1401) and determines a transform block size that was used in the encoding process (1402). In particular, transform unit 38 and prediction unit 32 can communicate syntax information to scan unit 45 to facilitate such determinations. Transform unit 38 can communicate the transform size to scan unit 45, and prediction unit 32 can communicate the intracoding mode to scan unit 45. Scan unit 45 can then select a scan order based on the intracoding mode and the transform block size (1403), and generate a syntax element to communicate the selected scan order (1404). Following entropy encoding by entropy encoding unit 46, video encoder 50 can output the syntax element with the encoded video data (1405). In this way, the scan order can be encoded as part of an encoded video bitstream and communicated to another device, so that the other device can determine how to perform the reciprocal (inverse) scan during a decoding process.
[077] Figure 15 is a flowchart illustrating a switched signaling technique consistent with this description. Figure 15 will be described from the perspective of video encoder 50 of Figure 2, although other devices may perform similar techniques. As illustrated in Figure 15, scan unit 45 defines a set of top scan order candidates (1501) and selects a scan order from the set of top scan order candidates based on an intracoding mode, a transform block size, or both the intracoding mode and the transform block size used to encode the video data block (1502). The sets of top scan order candidates can be programmed into scan unit 45 for intracoding modes, transform block sizes, or combinations of intracoding mode and transform block size. By defining a set of top scan order candidates, the number of possibilities can be limited, in order to limit computation and limit the amount of signaling overhead. Scan unit 45 generates a syntax element to communicate the selected scan order (1503), and following entropy coding by entropy coding unit 46, video encoder 50 can output the syntax element with the encoded video data (1504).
[078] In order to support the technique of Figure 15, both video encoder 50 and video decoder 60 can be configured to define the same sets of top scan order candidates in the different situations. The syntax element defining the scan order depends on the set of top scan order candidates, which video encoder 50 and video decoder 60 may each define in the same way.
[079] Figure 16 is a flowchart illustrating a switched signaling technique from the perspective of video decoder 60, consistent with this description. Although Figure 16 is described from the perspective of video decoder 60 of Figure 3, other devices can perform similar techniques. As illustrated in Figure 16, entropy decoding unit 52 receives a syntax element for a block of video data (e.g., a block of significant coefficients) (1601). Inverse scan unit 55 defines a set of top scan order candidates based on one or both of an intracoding mode used in encoding the video data and the transform block size used (1602). This allows inverse scan unit 55 to properly interpret the received syntax element, which can comprise an index value with respect to the set of top scan order candidates. Accordingly, inverse scan unit 55 performs an inverse scan based on the syntax element with respect to the set of top scan order candidates (1603).
[080] The techniques in this description can be performed in a wide variety of devices or apparatuses, including a wireless device, and an integrated circuit (IC) or set of ICs (i.e., a chip set). Any components, modules or units described have been provided to emphasize functional aspects and do not necessarily require realization by different hardware units.
[081] Accordingly, the techniques described here may be implemented in hardware, software, firmware, or any combination thereof. Any features described as modules or components can be implemented together in an integrated logic device or separately as discrete but interoperable logic devices. If implemented in software, the techniques can be performed at least in part by a computer-readable medium comprising instructions that, when executed, perform one or more of the methods described above. The computer-readable data storage medium may form part of a computer program product, which may include packaging materials.
[082] The computer-readable medium may comprise a tangible computer-readable storage medium, such as random access memory (RAM) such as synchronous dynamic random access memory (SDRAM), read-only memory (ROM), non-volatile random access memory (NVRAM), electrically erasable programmable read-only memory (EEPROM), FLASH memory, magnetic or optical data storage media, and the like. The techniques additionally or alternatively can be realized at least in part by a computer-readable communication medium that carries or communicates code in the form of instructions or data structures and that can be accessed, read and/or executed by a computer.
[083] Instructions can be executed by one or more processors, such as one or more digital signal processors (DSPs), general purpose microprocessors, application-specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. The term "processor" as used herein may refer to any of the foregoing structures or any other structure suitable for implementing the techniques described herein. Additionally, in some aspects, the functionality described here may be provided within dedicated software modules or hardware modules configured for encoding and decoding, or incorporated into a combined video encoder-decoder (CODEC). Furthermore, the techniques could be fully implemented in one or more circuits or logic elements.
[084] Several aspects of the description have been described. These and other aspects are within the scope of the appended claims.
Claims (16)
[0001]
1. Method for encoding coefficients associated with a block of video data, in which the coefficients are obtained by transforming the block of video data, the method CHARACTERIZED by the fact that it comprises: for each combination of possible intracoding modes and possible transform block sizes, defining a set of higher scan order candidates, each set of higher scan order candidates being a subset of the possible scan order candidates defined for any combinations of possible intracoding modes and possible transform block sizes; for a block of video data encoded using a determined intracoding mode and having a determined transform block size, selecting the set of higher scan order candidates defined for a block predicted using the determined intracoding mode and having the determined transform block size, and selecting the scan order from the selected set of higher scan order candidates, where the coefficients are scanned with the selected scan order; and generating a syntax element to communicate the selected scan order from the selected set of higher candidates.
[0002]
2. Method, according to claim 1, CHARACTERIZED by the fact that the coefficients comprise significant coefficients and zero value coefficients defining a significance map.
[0003]
3. Method, according to claim 2, characterized by the fact that the significant coefficients comprise one-bit indicators that identify non-zero value coefficients.
[0004]
4. Method according to claim 1, characterized by the fact that the video data block comprises a coding unit (CU) within a largest coding unit (LCU), wherein the LCU is partitioned into a set of CUs according to a quadtree partitioning scheme.
[0005]
5. Method for decoding coefficients associated with a video data block, in which the coefficients are obtained by transforming the video data block, the method CHARACTERIZED by the fact that it comprises: for each combination of possible intracoding modes and possible transform block sizes, defining a set of higher scan order candidates, each set of higher scan order candidates being a subset of the possible scan order candidates defined for any combinations of possible intracoding modes and possible transform block sizes; for a received block of video data encoded using a determined intracoding mode and having a determined transform block size, selecting the set of higher scan order candidates defined for a block predicted using the determined intracoding mode and having the determined transform block size; receiving a syntax element with the video data block, where the syntax element defines a scan order from the selected set of higher scan order candidates; and inverse scanning the coefficients associated with the video data block from a serialized representation of the coefficients associated with the video data block to a two-dimensional representation of the coefficients associated with the video data block using the scan order from the selected set of higher scan order candidates identified by the received syntax element.
[0006]
6. Method, according to claim 5, CHARACTERIZED by the fact that the coefficients comprise significant coefficients and zero value coefficients defining a significance map.
[0007]
7. Video encoding device that encodes coefficients associated with a block of video data, where the coefficients are obtained by transforming the block of video data, the video encoding device CHARACTERIZED by the fact that it comprises: a prediction unit that performs intraprediction coding of the video data block based on an intracoding mode; a transform unit that determines a transform size and performs a transform on the video data block according to the transform size; and a scan unit that: for each combination of possible intracoding modes and possible transform block sizes, defines a set of higher scan order candidates, each set of higher scan order candidates being a subset of the possible scan order candidates defined for any combinations of possible intracoding modes and possible transform block sizes; for a block of video data encoded using a determined intracoding mode and having a determined transform block size, selects the set of higher scan order candidates defined for the block predicted using the determined intracoding mode and having the determined transform block size; selects the scan order from the selected set of higher scan order candidates, where the coefficients are scanned with the selected scan order; and generates a syntax element to identify the selected scan order from the selected set of higher candidates.
[0008]
8. Video encoding device, according to claim 7, CHARACTERIZED by the fact that the coefficients comprise significant coefficients and zero value coefficients defining a significance map.
[0009]
9. Video encoding device, according to claim 8, characterized by the fact that the significant coefficients comprise one-bit indicators that identify non-zero value coefficients.
[0010]
10. Video encoding device according to claim 7, characterized by the fact that the video data block comprises a coding unit (CU) within a largest coding unit (LCU), wherein the LCU is partitioned into a set of CUs according to a quadtree partitioning scheme.
[0011]
11. Video encoding device according to any one of claims 7 to 10, characterized by the fact that it comprises one or more of: an integrated circuit; a microprocessor; and a wireless communication device that includes a video encoder.
[0012]
12. Video decoding device that decodes coefficients associated with a video data block, wherein the coefficients are obtained by transforming the video data block, the video decoding device CHARACTERIZED by the fact that it comprises: a scan unit that, for each combination of possible intracoding modes and possible transform block sizes, defines a set of higher scan order candidates, each set of higher scan order candidates being a subset of the possible scan order candidates defined for any combination of possible intracoding modes and possible transform block sizes; and a unit that, for a received block of video data encoded using a determined intracoding mode and having a determined transform block size, selects the set of higher scan order candidates defined for the block predicted using the determined intracoding mode and having the determined transform block size, and that receives a syntax element with the data block, where the syntax element defines a scan order from the selected set of higher scan order candidates, wherein the scan unit inverse scans the coefficients associated with the video data block from a serialized representation of the coefficients associated with the video data block to a two-dimensional representation of the coefficients associated with the video data block using the scan order from the selected set of higher scan order candidates identified by the received syntax element.
[0013]
13. Video decoding device, according to claim 12, CHARACTERIZED by the fact that the coefficients comprise significant coefficients and zero value coefficients defining a significance map.
[0014]
14. Video decoding device according to claim 12, characterized by the fact that it also comprises: a prediction unit that performs intraprediction decoding of the video data block based on the intracoding mode; and an inverse transform unit that performs an inverse transform with respect to the video data block based on the size of the transform block.
[0015]
15. Video decoding device according to any one of claims 12 to 14, characterized by the fact that it comprises one or more of: an integrated circuit; a microprocessor; and a wireless communication device that includes a video decoder.
[0016]
16. Computer-readable medium CHARACTERIZED by the fact that it comprises instructions that, when executed, cause a processor to encode or decode coefficients associated with a video block, where the coefficients are obtained by transforming the block of video data, wherein the instructions cause the processor to execute the method as defined in any one of claims 1 to 6.
Similar technologies:
Publication number | Publication date | Patent title
BR112013015895B1|2021-07-27|METHODS FOR ENCODING AND DECODING COEFFICIENTS ASSOCIATED WITH A BLOCK OF VIDEO DATA, VIDEO ENCODING AND DECODING DEVICES, AND COMPUTER-READABLE MEDIA
ES2845673T3|2021-07-27|Fragment level intrablock copy
US9699472B2|2017-07-04|Restriction of prediction units in B slices to uni-directional inter prediction
JP5960309B2|2016-08-02|Video coding using mapped transform and scan mode
JP6396439B2|2018-09-26|Residual differential pulse code modulation | expansion and harmony with conversion skip, rotation, and scanning
ES2777218T3|2020-08-04|Disabling Sign Data Hiding in Video Encoding
US9648334B2|2017-05-09|Bi-predictive merge mode based on uni-predictive neighbors in video coding
KR101617107B1|2016-04-29|Adaptive loop filtering for chroma components
EP2684356B1|2016-01-13|MOTION VECTOR PREDICTORS | FOR BI-PREDICTIVE INTER MODE IN VIDEO CODING
ES2864623T3|2021-10-14|Coding of quantization parameters | in video coding
ES2873548T3|2021-11-03|Escape pixel encoding for palette encoding
BR112021004492A2|2021-05-25|adaptive multiple transform coding
US20210051319A1|2021-02-18|Video signal processing method and device using reference sample
US20210377519A1|2021-12-02|Intra prediction-based video signal processing method and device
KR20210158385A|2021-12-30|Video encoding/decoding method, apparatus, and bitstream transmission method based on intra prediction mode transformation
Chen et al.2014|Hybrid transform for HEVC-based lossless coding
BR112013007302B1|2022-02-15|ENTROPY ENCODING COEFFICIENTS USING A JOINT CONTEXT MODEL
Family patents:
Publication number | Publication date
EP2656607A1|2013-10-30|
US20120163455A1|2012-06-28|
ES2693643T3|2018-12-13|
US9049444B2|2015-06-02|
MY173210A|2020-01-06|
AU2011349686A1|2013-07-04|
SG190930A1|2013-07-31|
KR20130105894A|2013-09-26|
JP5792319B2|2015-10-07|
CA2822259A1|2012-06-28|
WO2012087713A1|2012-06-28|
JP2014504489A|2014-02-20|
RU2013133841A|2015-02-10|
RU2547239C2|2015-04-10|
KR101540528B1|2015-07-29|
BR112013015895A2|2018-06-05|
EP2656607B1|2018-09-12|
CN103270754A|2013-08-28|
Cited references:
Publication number | Application date | Publication date | Applicant | Patent title

US5122875A|1991-02-27|1992-06-16|General Electric Company|An HDTV compression system|
US6680975B1|1992-02-29|2004-01-20|Samsung Electronics Co., Ltd.|Signal encoding and decoding system and method|
EP1835761A3|1996-05-28|2007-10-03|Matsushita Electric Industrial Co., Ltd.|Decoding apparatus and method with intra prediction and alternative block scanning|
US6054943A|1998-03-25|2000-04-25|Lawrence; John Clifton|Multilevel digital information compression based on lawrence algorithm|
JP2000013609A|1998-06-23|2000-01-14|Fujitsu Ltd|Encoding device|
US6658159B1|2000-03-17|2003-12-02|Hewlett-Packard Development Company, L.P.|Block entropy coding in embedded block coding with optimized truncation image compression|
EP1368748A2|2001-01-10|2003-12-10|Koninklijke Philips Electronics N.V.|Method and system to encode a set of input values into a set of coefficients using a given algorithm|
US6870963B2|2001-06-15|2005-03-22|Qualcomm, Inc.|Configurable pattern optimizer|
US6795584B2|2002-10-03|2004-09-21|Nokia Corporation|Context-based adaptive variable length coding for adaptive block transforms|
US7688894B2|2003-09-07|2010-03-30|Microsoft Corporation|Scan patterns for interlaced video content|
US7782954B2|2003-09-07|2010-08-24|Microsoft Corporation|Scan patterns for progressive video content|
US20060078049A1|2004-10-13|2006-04-13|Nokia Corporation|Method and system for entropy coding/decoding of a video bit stream for fine granularity scalability|
US20060256854A1|2005-05-16|2006-11-16|Hong Jiang|Parallel execution of media encoding using multi-threaded single instruction multiple data processing|
EP1739971A1|2005-06-28|2007-01-03|Thomson Licensing|Method and apparatus for encoding a video signal|
EP1753242A2|2005-07-18|2007-02-14|Matsushita Electric Industrial Co., Ltd.|Switchable mode and prediction information coding|
WO2007079782A1|2006-01-13|2007-07-19|Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.|Quality scalable picture coding with particular transform coefficient scan path|
CN100546390C|2006-03-16|2009-09-30|清华大学|In picture coding course, realize the method for adaptive scanning|
EP1859630B1|2006-03-22|2014-10-22|Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.|Coding scheme enabling precision-scalability|
RU2406254C2|2006-03-29|2010-12-10|Квэлкомм Инкорпорейтед|Video processing with scalability|
DE602007008730D1|2006-07-13|2010-10-07|Qualcomm Inc|VIDEO-CODING WITH FINE-SCALE SCALABILITY BASED ON CYCLIC-ORIENTED FRAGMENTS|
KR100882949B1|2006-08-17|2009-02-10|한국전자통신연구원|Apparatus and method of encoding and decoding using adaptive scanning of DCT coefficients according to the pixel similarity|
US8428133B2|2007-06-15|2013-04-23|Qualcomm Incorporated|Adaptive coding of video block prediction mode|
US8571104B2|2007-06-15|2013-10-29|Qualcomm, Incorporated|Adaptive coefficient scanning in video coding|
WO2009001793A1|2007-06-26|2008-12-31|Kabushiki Kaisha Toshiba|Image encoding and image decoding method and apparatus|
EP2197215A4|2007-09-06|2011-03-23|Nec Corp|Video encoding device, video decoding device, video encoding method, video decoding method, and video encoding or decoding program|
KR20090097689A|2008-03-12|2009-09-16|삼성전자주식회사|Method and apparatus of encoding/decoding image based on intra prediction|
US8902972B2|2008-04-11|2014-12-02|Qualcomm Incorporated|Rate-distortion quantization for context-adaptive variable length coding |
JP2010004284A|2008-06-19|2010-01-07|Toshiba Corp|Image decoder, and image decoding method|
EP2182732A1|2008-10-28|2010-05-05|Panasonic Corporation|Switching between scans in image coding|
US8737613B2|2008-11-20|2014-05-27|Mediatek Inc.|Scanning methods of transform-based digital data processing that conditionally adjust scan order according to characteristics information and related apparatus thereof|
WO2010143853A2|2009-06-07|2010-12-16|엘지전자 주식회사|Method and apparatus for decoding a video signal|
CN102045560B|2009-10-23|2013-08-07|华为技术有限公司|Video encoding and decoding method and video encoding and decoding equipment|
US20110280314A1|2010-05-12|2011-11-17|Texas Instruments Incorporated|Slice encoding and decoding processors, circuits, devices, systems and processes|
US9661338B2|2010-07-09|2017-05-23|Qualcomm Incorporated|Coding syntax elements for adaptive scans of transform coefficients for video coding|
US8913666B2|2010-10-01|2014-12-16|Qualcomm Incorporated|Entropy coding coefficients using a joint context model|
US9497472B2|2010-11-16|2016-11-15|Qualcomm Incorporated|Parallel context calculation in video coding|
US9866829B2|2012-01-22|2018-01-09|Qualcomm Incorporated|Coding of syntax elements that correspond to coefficients of a coefficient block in video coding|KR101495724B1|2010-02-02|2015-02-25|삼성전자주식회사|Method and apparatus for video encoding based on scanning order of hierarchical data units, and method and apparatus for video decoding based on the same|
KR101373814B1|2010-07-31|2014-03-18|엠앤케이홀딩스 주식회사|Apparatus of generating prediction block|
US9497472B2|2010-11-16|2016-11-15|Qualcomm Incorporated|Parallel context calculation in video coding|
EP2663075B1|2011-01-06|2020-05-06|Samsung Electronics Co., Ltd|Encoding method and device of video using data unit of hierarchical structure, and decoding method and device thereof|
US9380319B2|2011-02-04|2016-06-28|Google Technology Holdings LLC|Implicit transform unit representation|
US8878861B2|2011-03-01|2014-11-04|Sony Corporation|Conversion between z-scanning indices, raster-scanning indices and 2-D coordinates using simple bit-operations in HEVC|
US10142637B2|2011-03-08|2018-11-27|Texas Instruments Incorporated|Method and apparatus for parallelizing context selection in video processing|
CN105791834B|2011-06-23|2018-01-02|Jvc 建伍株式会社|Picture decoding apparatus and picture decoding method|
CA2975456C|2011-06-27|2019-10-22|Samsung Electronics Co., Ltd.|Encoding and decoding video by providing a predtermined minimum amount of motion information from spatial and temporal prediction units|
RU2619706C2|2011-06-28|2017-05-17|Самсунг Электроникс Ко., Лтд.|Method and device for encoding video, and method and device for decoding video which is accompanied with internal prediction|
US9008179B2|2011-06-30|2015-04-14|Futurewei Technologies, Inc.|Encoding of prediction residuals for lossless video coding|
EP2773118B1|2011-10-24|2020-09-16|Innotive Ltd|Method and apparatus for image decoding|
US10390016B2|2011-11-04|2019-08-20|Infobridge Pte. Ltd.|Apparatus of encoding an image|
KR20130049524A|2011-11-04|2013-05-14|오수미|Method for generating intra prediction block|
KR20130049522A|2011-11-04|2013-05-14|오수미|Method for generating intra prediction block|
BR112014011123A2|2011-11-08|2017-05-16|Kt Corp|coefficient scan method and apparatus based on prediction unit partition mode|
US9344722B2|2011-11-18|2016-05-17|Futurewei Technologies, Inc.|Scanning of prediction residuals in high efficiency video coding|
GB2501535A|2012-04-26|2013-10-30|Sony Corp|Chrominance Processing in High Efficiency Video Codecs|
CN108632611A|2012-06-29|2018-10-09|韩国电子通信研究院|Video encoding/decoding method, method for video coding and computer-readable medium|
US9264713B2|2012-07-11|2016-02-16|Qualcomm Incorporated|Rotation of prediction residual blocks in video coding with transform skipping|
WO2014084656A1|2012-11-29|2014-06-05|엘지전자 주식회사|Method and device for encoding/ decoding image supporting plurality of layers|
AU2013395426B2|2013-07-24|2017-11-30|Microsoft Technology Licensing, Llc|Scanning orders for non-transform coding|
WO2015015681A1|2013-07-31|2015-02-05|パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ|Image coding method, and image coding device|
EP3087743A4|2013-12-27|2017-02-22|HFI Innovation Inc.|Method and apparatus for major color index map coding|
US9854261B2|2015-01-06|2017-12-26|Microsoft Technology Licensing, Llc.|Detecting markers in an encoded video signal|
WO2016133356A1|2015-02-17|2016-08-25|엘지전자|Method and device for encoding/decoding video signal by using adaptive scan order|
EP3264771A4|2015-02-27|2018-08-29|KDDI Corporation|Coding device and decoding device|
CN107637077B|2015-05-12|2021-11-12|三星电子株式会社|Video encoding method, video decoding method, and computer-readable medium|
CA2988451C|2015-06-23|2021-01-19|Mediatek Singapore Pte. Ltd.|Method and apparatus for transform coefficient coding of non-square blocks|
KR102159252B1|2016-02-12|2020-10-14|후아웨이 테크놀러지 컴퍼니 리미티드|Method and apparatus for selecting the scan order|
KR20180129863A|2016-04-25|2018-12-05|엘지전자 주식회사|Image decoding method and apparatus in video coding system|
KR20180025283A|2016-08-31|2018-03-08|주식회사 케이티|Method and apparatus for processing a video signal|
US11102513B2|2018-12-06|2021-08-24|Tencent America LLC|One-level transform split and adaptive sub-block transform|
US11102490B2|2018-12-31|2021-08-24|Tencent America LLC|Coefficient scanning methods on adaptive angle mode|
WO2020145381A1|2019-01-13|2020-07-16|ソニー株式会社|Image processing device and image processing method|
WO2021052494A1|2019-09-21|2021-03-25|Beijing Bytedance Network Technology Co., Ltd.|Size restriction based for chroma intra mode|
Legal status:
2018-07-10| B25A| Requested transfer of rights approved|Owner name: VELOS MEDIA INTERNATIONAL LIMITED (IE) |
2018-12-18| B06F| Objections, documents and/or translations needed after an examination request according [chapter 6.6 patent gazette]|
2020-03-31| B06U| Preliminary requirement: requests with searches performed by other patent offices: procedure suspended [chapter 6.21 patent gazette]|
2020-04-22| B15K| Others concerning applications: alteration of classification|Free format text: THE PREVIOUS CLASSIFICATION WAS: H04N 7/26 Ipc: H04N 19/129 (2014.01), H04N 19/13 (2014.01), H04N |
2021-06-29| B09A| Decision: intention to grant [chapter 9.1 patent gazette]|
2021-07-27| B16A| Patent or certificate of addition of invention granted [chapter 16.1 patent gazette]|Free format text: TERM OF VALIDITY: 20 (TWENTY) YEARS COUNTED FROM 14/12/2011, SUBJECT TO THE LEGAL CONDITIONS. |
Priority:
Application number | Application date | Patent title
US201061426349P| true| 2010-12-22|2010-12-22|
US201061426372P| true| 2010-12-22|2010-12-22|
US61/426,372|2010-12-22|
US61/426,349|2010-12-22|
US201161436835P| true| 2011-01-27|2011-01-27|
US61/436,835|2011-01-27|
US13/179,321|US9049444B2|2010-12-22|2011-07-08|Mode dependent scanning of coefficients of a block of video data|
US13/179,321|2011-07-08|
PCT/US2011/064964|WO2012087713A1|2010-12-22|2011-12-14|Mode dependent scanning of coefficients of a block of video data|